The Common HOL project aims to facilitate porting source code and proofs between members of the
HOL family of theorem provers. At the heart of the project is the Common HOL Platform, which 
defines a standard HOL theory and API that aims to be compatible with all HOL systems. So far, 
HOL Light and hol90 have been adapted for conformance, and HOL Zero was originally developed 
to conform. In this paper we provide motivation for a platform, give an overview of the Common 
HOL Platform’s theory and API components, and show how to adapt legacy systems. We also report 
on the platform’s successful application in the hand-translation of a few thousand lines of source 
code from HOL Light to HOL Zero. 


1 Introduction 

The HOL family of theorem provers started in the 1980s with HOL88 [5], and has since grown to
include many systems, most prominently HOL4, HOL Light [8], ProofPower HOL and
Isabelle/HOL. These four main systems have developed their own advanced proof facilities and
extensive theory libraries, and have been successfully employed in major projects in the verification of
critical hardware and software and the formalisation of mathematics.

It would clearly be of benefit if these systems could “talk” to each other, specifically if theory, proofs
and source code could be exchanged in a relatively seamless manner. This would reduce the considerable
duplication of effort otherwise required for one system to benefit from the major projects and advanced
capabilities developed on another. Work to date has concentrated on exchange of proofs via proof objects,
with some degree of success, but little has been done to facilitate porting of source code.

The Common HOL Platform is part of the Common HOL project for facilitating the porting of source
code and proofs between HOL systems. It defines a standard HOL theory compatible with the core theory
of each HOL system, and an application programming interface (API) of programming components that
is more-or-less common to all HOL systems. It has so far been supported in HOL Light, hol90 and
HOL Zero.

In this paper we give an overview of the platform. In Section 2, we further discuss motivation. In 
Section 3, we cover the platform’s choice of components. In Section 4, we explain how to adapt legacy 
systems to conform to the platform. In Section 5, we report on its successful usage in assisting the 
manual porting of both new and legacy source code. In Section 6, we present our conclusions. 

2 Motivation 

By definition, all systems in the HOL family implement the HOL logic or a close variant. However, 
in practice their commonality stretches far beyond this. They have broadly similar axiomatisations 
of the logic, similar mechanisms for logical extension, similar formal language concrete syntax and 
build up similar foundational theory. Furthermore, in most basic usage at least, they each support 
similar paradigms of user interaction, namely simple forwards-style application of inference rules and
backwards-style tactic proofs via the subgoal package, performed in an interactive functional
programming session. Also, their implementations are all written in variants of the ML functional
programming language, all employ an LCF-style architecture and are all built up from similar libraries
of programming utilities, syntax utilities, inference rules and tactics.

Other than in these basic aspects, the systems branch off in their own respects. Each builds up
considerable theory beyond the basic foundations in its own way. For example, real numbers in HOL Light
are constructed quite differently from real numbers in ProofPower HOL. There is also much variation in
their provision of user proof commands, especially for those relating to proof automation, with each
system having its own strengths and idiosyncrasies. Most different is Isabelle/HOL, which is implemented
as an instantiation of the Isabelle generic theorem prover rather than by having its deductive system
“hardwired” as source code, and supports a variant of the HOL logic that has axiomatic type classes.
Also, the predominant mode of interaction with Isabelle has become the declarative proof language Isar
in conjunction with a bespoke IDE, rather than the subgoal package in an interactive ML session.

Porting proofs between HOL systems by hand involves translating proof scripts. These proof scripts
typically involve heavy use of high-level proof commands that differ between systems. In cases where 
such commands are used to finish off subgoals, it is often possible to find a suitably powerful command 
to do the same in the target system, but in other cases proof scripts have to be recreated from scratch. 
Automatic proof porting, via recording of low-level proof steps and export to proof object files, is vastly 
preferable if it can be made sufficiently reliable. Such a capability requires a platform of common 
foundational theory, inference rules and logical extension mechanisms in both systems. 

There have been notable successes in the large scale porting of legacy proofs between HOL systems
via proof objects. Obua and Skalberg developed a capability for porting proofs from HOL4 to
Isabelle/HOL, using a theory platform based on the HOL4 inference kernel, and then adapted this for
porting from HOL Light to Isabelle/HOL. Kaliszyk and Krauss developed a capability for porting
from HOL Light to Isabelle/HOL, based on the HOL Light inference kernel. The OpenTheory project
is based around the HOL Light axiomatisation, and establishes a common proof object format for porting
proofs between various HOL systems, including HOL4, ProofPower/HOL and HOL Light, with ongoing
work to support Isabelle/HOL. However, these capabilities would all struggle to port something as large
as the entire Flyspeck project. We believe that significant advances in capability can be achieved
by exploiting a broader commonality that exists between HOL systems, using a platform at a somewhat
higher level than the inference kernel of one system.

Porting source code from one system to another currently requires deep knowledge of both systems’ 
implementations and can entail weeks of effort to replicate behaviour sufficiently closely. Naive
porting of high-level routines will typically result in unreliable code due to the compounding of small and
subtle differences in the theory or in ML function behaviour. We know of no pre-existing capability for 
supporting the systematic porting of source code between HOL systems. 

We believe that if the existing HOL systems can be adapted to support a well-designed API that 
reflects the commonality of “primary functionality” (by which we mean functionality directly concerned 
with theorem proving) between the systems, then much of the pain of porting source code can be avoided. 
There is then a platform of precisely corresponding programming components, and source code built on 
this platform in one system can be trivially but accurately ported to another system conforming to the 
same platform. As is also the case for a proof porting capability, both ML components and foundational 
theory have to be taken into account when designing an effective platform. 

3 Components

In this section, we give an overview of the components that make up version 0.5 of the Common HOL 
Platform. This is the latest version, and has been implemented for HOL Light and HOL Zero. An earlier 
version was implemented for hol90, but this has not yet been upgraded. Even though the platform has 
not yet been implemented for ProofPower HOL or HOL4, it has been carefully designed with knowledge 
of how these systems work. However, little consideration has so far been given to Isabelle/HOL, which 
presents greater challenges due to its greater differences. A significant redesign of the standard would 
probably be required to properly cater for Isabelle/HOL. 

There is no space in this paper to list all the platform components, let alone to describe each one. 
Instead we provide various tables comparing some corresponding components from hol90, HOL4,
ProofPower HOL, HOL Light and HOL Zero. For a given system, each platform component is either exactly
represented in the system, or it is approximately represented, or it is not represented in the system. In 
our listings, those components only approximately corresponding are written in curly brackets. 

There is not yet a single stand-alone document specifically for the purpose of precisely defining each
platform component. However, part of the original motivation for the HOL Zero system was to act as a
clear demonstration of the platform, and it has been designed to exactly conform to platform behaviour
without adaption. Readers can download the HOL Zero source distribution, where source code file
commonhol.mli gives a complete list of the API components, and the user manual appendices give a
precise description of each API and theory component.

3.1 Considerations 

Here we discuss some factors that should be taken into consideration when choosing the components.

Commonality Platform components should broadly reflect the commonality that exists between the
systems. Including components that are only relevant in one system would entail extra effort to
make the other systems conformant, and would be of little use to them. Not including components
that are common to all systems would mean that basic components from one system would have
to be needlessly considered when porting to a target system.

Usage Amount of usage in post-platform code should be taken into consideration when deciding the
platform components. Heavily used components should almost qualify by default.

Level The components should be sufficiently high-level to be of likely use in post-platform source code.
For example, including low-level subcomponents used to make a HOL term parser would be of
little use, even if these components were common to all HOL systems.

Precision A platform without precisely defined components of course loses much of its purpose. In HOL
systems, there are many small differences in the details of the behaviour of various corresponding
basic functions. For each component, the platform should explicitly specify its exact behaviour or
otherwise be clear about what is not specified. Non-conformant components must have
platform-conformant variants defined as part of platform qualification.

Underspecification The API should allow some degree of flexibility in certain kinds of details about
its components. Examples are the ML names of the components, the order in which function
components take arguments, and whether tupled or curried form is used. The API should seek to
minimise the effort required to make legacy systems conformant by underspecifying these details,
which are not the kinds of differences that make porting source code difficult.


M. Adams 



Completeness The components should be complete in the sense that all primary functionality can be
built from platform components alone. This becomes essential for the constructors and destructors
of abstract datatypes (such as for HOL types, terms and theorems) because there is otherwise no
way of manipulating such values.

Coherence The components should be chosen as a coherent set that categorise in a complete and
consistent way and that composes robustly. This makes it easier to write new code based on the API,
as well as helping portability.

Performance The API should not exclude components that are important to the performance of a system
if excluding them would mean that they have to be reimplemented in the outer platform in terms of API
components, resulting in a significant degradation in performance.

Ease of Implementation The implementation effort required to conform to a platform is a significant 
consideration. Otherwise, in practice the platform will not get implemented for the full range of 
HOL systems, which defeats its purpose. 

3.2 Theory Components 

The theory components are the axioms, declarations and definitions that must exist in a conformant 
system’s theory. They must form a sufficient basis for building up each HOL system’s theory. 

There is some variation in the systems’ axiomatisations, especially between HOL Light and the other 
systems. Because each system implements the same formal logic, for our purposes of completeness it 
is sufficient to choose the core theory (i.e. the theory of the logical core) of one system as the theory 
platform, and to derive this in the other systems from their respective core theories. The outer platform 
(see Section 4.1) in these other systems can then “re-derive” the system’s core theory using the theory
platform. A platform theorem may be an axiom or definition theorem in one system and a derived 
theorem in another, but as far as the platform is concerned they are all just theorems. 

Our theory platform features the axioms and definitions of ProofPower HOL, which we view as 
the most intuitive, and which are close to those of hol90 and HOL4. It also includes the HOL Light 
definition of the implication operator, which does not feature in the other systems because the behaviour 
of implication drops out from their primitive inference rules and the implication antisymmetry axiom. 
Including this definition means that any of the systems’ primitive inference rule sets suffices to complete
the deductive system. A handful of fundamental theorems that are common to but derived in each system
are included in the platform, such as the truth theorem and the Law of the Excluded Middle, because they
are inevitably needed in implementing the platform and so may as well feature as components.

The type constants and constants declared in the theory platform include those from the basic theory
about predicate logic and lambda calculus that is common to each HOL system, established in the logical
core and initial derived theory of each system. This includes the function space type operator and the
boolean base type, plus the equality, conjunction, disjunction, implication and logical negation operators,
the universal, existential and unique existential quantifiers and the Hilbert choice operator.

Beyond this, each system builds up essentially equivalent theory of pairs, lists and natural numbers.
To take advantage of this commonality, the platform also includes theory for pairs and natural numbers,
including natural number numerals and 13 classic arithmetic operators including plus, multiply and
exponentiation. Theory for lists does not currently feature, but is planned for inclusion in a future version.

hol90             HOL4              ProofPower        HOL Light         HOL Zero
----------------  ----------------  ----------------  ----------------  ----------------
"bool"            "bool"            "BOOL"            "bool"            "bool"
"fun"             "fun"             "→"               "fun"             "->"
"prod"            "prod"            "×"               "prod"            "#"
"ind"             "ind"             "IND"             "ind"             "ind"
"num"             "num"             "ℕ"               "num"             "nat"
"T"               "T"               "T"               "T"               "true"
"F"               "F"               "F"               "F"               "false"
"="               "="               "="               "="               "="
"/\"              "/\"              "∧"               "/\"              "/\"
"\/"              "\/"              "∨"               "\/"              "\/"
"==>"             "==>"             "⇒"               "==>"             "==>"
"~"               "~"               "¬"               "~"               "~"
"!"               "!"               "∀"               "!"               "!"
"?"               "?"               "∃"               "?"               "?"
"?!"              "?!"              "∃₁"              "?!"              "?!"
"@"               "@"               "ε"               "@"               "@"
IMP_ANTISYM_AX    IMP_ANTISYM_AX+   ⇒_antisym_axiom   -                 imp_antisym_ax
ETA_AX            ETA_AX            η_axiom           ETA_AX            eta_ax
SELECT_AX         SELECT_AX         ε_axiom           SELECT_AX         select_ax
BOOL_CASES_AX     BOOL_CASES_AX     bool_cases_axiom  BOOL_CASES_AX+    bool_cases_thm+
INFINITY_AX       INFINITY_AX       infinity_axiom    INFINITY_AX       infinity_ax
T_DEF             T_DEF             t_def             T_DEF             true_def
F_DEF             F_DEF             f_def             F_DEF             false_def
AND_DEF           AND_DEF           ∧_def             {AND_DEF}         conj_def
-                 -                 -                 IMP_DEF           -
OR_DEF            OR_DEF            ∨_def             OR_DEF            disj_def
NOT_DEF           NOT_DEF           ¬_def             NOT_DEF           not_def
FORALL_DEF        FORALL_DEF        ∀_def             FORALL_DEF        forall_def
EXISTS_DEF        EXISTS_DEF        ∃_def             EXISTS_THM+       exists_def
{UEXISTS_DEF}     {UEXISTS_DEF}     ∃₁_def            {UEXISTS_DEF}     uexists_def

Table 1: The type constants, some of the constants and some of the theorems (including all the axioms)
of the theory platform. Derived theorems in a given system are marked with +.

The representation of natural number numerals varies between HOL systems: in HOL Light, HOL4
and HOL Zero, each numeral is constructed using compounding of two unary operators on the zero
constant (one for multiplying by two and adding one, and one for multiplying by two and adding zero or
two depending on the system), whereas numerals in hol90 and ProofPower HOL form an infinite family
of constants. However, beyond the definition of a set of basic numeral arithmetic evaluation inference
rules, these differences do not surface in practice in the implementations of the systems. Thus we have
abstracted away from the theory platform the detail of how numerals are defined.
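
The bit-based numeral encoding described above can be pictured with a small OCaml model. This is only a sketch under stated assumptions: the constructor names Zero, Bit0 and Bit1 are hypothetical rather than the constant names of any particular HOL system, the "adding zero" variant of the second operator is the one shown, and OCaml integers stand in for natural numbers.

```ocaml
(* Hypothetical model of a bit-based numeral: two unary operators
   compounded on a zero constant. The constructor names are illustrative
   only and are not taken from any HOL system. *)
type numeral =
  | Zero
  | Bit0 of numeral   (* multiply by two and add zero *)
  | Bit1 of numeral   (* multiply by two and add one *)

(* The natural number denoted by a numeral. *)
let rec to_int = function
  | Zero -> 0
  | Bit0 n -> 2 * to_int n
  | Bit1 n -> 2 * to_int n + 1

(* Build the numeral for a non-negative integer; negative input is rejected. *)
let rec of_int k =
  if k < 0 then invalid_arg "of_int: negative"
  else if k = 0 then Zero
  else if k mod 2 = 0 then Bit0 (of_int (k / 2))
  else Bit1 (of_int (k / 2))
```

Code written against evaluation functions such as to_int never inspects the constructors, which is the sense in which the representation can be kept abstract.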

3.3 API Components 

The API components form the ML interface for programming primary functionality. There are approximately
475 components, mainly consisting of ML function and constant values, but also seven datatypes
and three exceptions. Three configuration values are also provided, that hold the HOL system name and
version and the Common HOL Platform version. In each conformant system, the API is provided as an
ML module interface file, with components given the same ordering to aid comparison between systems.
Note that table components that have ML infix fixity in a given system are written in parentheses.

3.3.1 Functional Programming Library 

There are around 100 functional programming library components (see Table 2 for a selection).

hol90         HOL4          ProofPower    HOL Light     HOL Zero
------------  ------------  ------------  ------------  ------------
curry         curry         curry         curry         curry
uncurry       uncurry       uncurry       uncurry       uncurry
C             C             switch        C             swap_arg
I             I             I             I             id_fn
K             K             K             K             con_fn
W             W             -             W             dbl_arg
(o)           (o)           (o)           (o)           (<*)
(##)          (##)          (**)          (F_F)         pair_apply
map           map           map           map           map
map2          map2          -             map2          bimap
{funpow}      {funpow}      fun_pow       {funpow}      funpow
itlist        itlist        fold          itlist        foldr
rev_itlist    rev_itlist    revfold       rev_itlist    foldl
end_itlist    end_itlist    -             end_itlist    foldr1
-             -             -             -             foldl1

Table 2: Some of the functional programming library API components.


Included are many basic operations on ML pairs, lists and strings, such as selecting the first element 
of a pair, reversing the order of elements in a list, or turning an integer into a string. Association lists 
are also supported. Also included are various classic functional programming meta operations, e.g. for 
applying a function to each element in a set, or folding up a list into a single element by repeated
application of a binary operator. There is also a collection of set operations on lists, such as set membership
and set union, under either equality comparison or a supplied equivalence relation. 

For coherence, we fill out the gaps that exist in the various legacy systems’ libraries. For example,
all kinds of folding operators and their inverses, unfolding operators, are provided, and all set operations
are provided both under equality and under a supplied equivalence relation.
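
To make the kinds of library component discussed above concrete, the following OCaml sketch shows folds over non-empty lists, a matching unfold, and set operations parameterised by an equivalence relation. It is an illustration only: foldr1 and foldl1 borrow HOL Zero's names, while unfoldr1, mem', insert' and union', together with all argument orders and the option-returning destructor, are assumptions made for this sketch and not the platform's definitions.

```ocaml
(* Hypothetical sketches of library components; names and argument
   orders are assumptions, not the platform's actual definitions. *)

(* Right fold over a non-empty list, with no separate base value. *)
let rec foldr1 f = function
  | [] -> failwith "foldr1: empty list"
  | [x] -> x
  | x :: xs -> f x (foldr1 f xs)

(* Left fold over a non-empty list. *)
let foldl1 f = function
  | [] -> failwith "foldl1: empty list"
  | x :: xs -> List.fold_left f x xs

(* Inverse of foldr1: keep destructing the right-hand part until the
   destructor reports that no further split is possible. *)
let rec unfoldr1 dest x =
  match dest x with
  | Some (a, rest) -> a :: unfoldr1 dest rest
  | None -> [x]

(* Set operations on lists under a supplied equivalence relation. *)
let mem' eq x xs = List.exists (fun y -> eq x y) xs
let insert' eq x xs = if mem' eq x xs then xs else x :: xs
let union' eq xs ys = List.fold_right (insert' eq) xs ys

(* The equality versions are instances of the general ones. *)
let mem x xs = mem' (=) x xs
let union xs ys = union' (=) xs ys
```

Defining the equality versions as instances of the general ones is one way of guaranteeing that the two families behave consistently.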

Three kinds of standard exception are catered for: normal failure, catastrophic failure and “local 
failure” (used for control flow within a function). The API underspecifies the form of the exception 
arguments and the textual content of error messages.

Note that there is some variation in the behaviour of some library functions between systems. For 
example, funpow, which iterates a function application for the number of times specified by a supplied
integer, does not fail in hol90, HOL4 or HOL Light if the integer is negative. Generally, platform 
functions are specified to fail if supplied with invalid arguments, and the platform version of funpow 
fails if its supplied integer is negative, as is done in ProofPower HOL and HOL Zero. 
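
The difference in funpow behaviour can be sketched in OCaml as follows. The exception Hol_fail is a hypothetical stand-in for the platform's normal failure exception (the API leaves the form of exception arguments unspecified), the argument order is likewise an assumption, and the lenient variant is assumed to return its argument unchanged when given a negative count.

```ocaml
(* Hypothetical stand-in for a "normal failure" exception: the name of
   the failing function paired with a message. *)
exception Hol_fail of string * string

(* Platform-style funpow: applies f to x exactly n times, and fails
   if n is negative. *)
let rec funpow n f x =
  if n < 0 then raise (Hol_fail ("funpow", "negative power"))
  else if n = 0 then x
  else funpow (n - 1) f (f x)

(* Lenient variant modelling a legacy system that does not fail on a
   negative count (assumed here to leave x unchanged). *)
let rec funpow_lenient n f x =
  if n <= 0 then x else funpow_lenient (n - 1) f (f x)
```

The two versions agree on all non-negative counts, so the difference only shows up when ported code happens to pass a negative integer, which is exactly the kind of subtle discrepancy the platform aims to pin down.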

3.3.2 Type, Term and Theorem Utilities 

Around 150 HOL type, term and theorem manipulation utilities are provided (see Table 3 for a selection).

The bulk of these utilities are syntax functions for HOL types or terms, for constructing, destructing 
and testing for a given syntactic category. Two levels of syntactic category are supported for both types 
and terms. Firstly, there are the primitive syntactic categories, namely the type variables and type constant 
applications for types, and variables, constants, function applications and lambda abstractions for terms. 
These are very widely used throughout the HOL implementations. Secondly, there are the basic syntactic 
categories associated with the type constants and constants of predicate logic and lambda calculus that 
feature in the theory platform. Some of these are also used heavily throughout the HOL implementations, 
but we include support for all such syntactic categories in the API for coherence with the theory platform 
and the API inference rules. 

hol90              HOL4               ProofPower         HOL Light          HOL Zero
-----------------  -----------------  -----------------  -----------------  -----------------
type_of            type_of            type_of            type_of            type_of
type_vars_in_term  type_vars_in_term  {term_tyvars}      type_vars_in_term  term_tyvars
aconv              aconv              (~=$)              aconv              alpha_eq
-                  rename_bvar        -                  {alpha}            rename_bvar
free_vars          free_vars          frees              frees              free_vars
free_varsl         free_varsl         -                  freesl             list_free_vars
-                  var_occurs         is_free_in         {vfree_in}         var_free_in
{free_in}          free_in            -                  free_in            term_free_in
all_vars           -                  -                  variables          all_vars
all_varsl          -                  -                  -                  list_all_vars
inst               {inst}             {inst}             {inst}             tyvar_inst
-                  -                  {var_subst}        vsubst             var_inst
{subst}            {subst}            subst              subst              subst

Table 3: Some of the term utility API components.


There are various ML bindings for HOL constants and base types featured in the theory platform, 
and for commonly used HOL type variables. Also included are utilities for destructing a theorem into 
its assumptions and conclusion parts, and for equality and alpha-equivalence comparison of theorems. 
There are also various type and term operations defined that are essential for defining an inference kernel. 
These include calculating the type of a term, listing the type variables of a type, testing for the alpha 
equivalence of two terms, and performing variable and type variable instantiation. 

The platform utilities for HOL terms are generally specified to work modulo alpha equivalence in 
their arguments. This was decided because different systems generate bound variable names differently 
when avoiding variable capture in type variable and variable instantiation, and so this measure makes 
the API functions more robust when ported. An arbitrary bound variable name used in an operation in 
one system could otherwise cause the equivalent operation in another system to fail. Note that hol90’s 
free_in, which tests for one term occurring free in another, does not work modulo alpha equivalence,
and so does not conform to the platform. 
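
The effect of working modulo alpha equivalence can be seen in a toy model. The untyped term datatype below is a simplification made for this sketch (real HOL terms carry types), and the functions reuse HOL Zero's names alpha_eq, var_free_in and term_free_in purely for illustration; they are not the platform's implementations.

```ocaml
(* Toy untyped terms; real HOL terms are typed, so this is a simplification. *)
type term =
  | Var of string
  | Const of string
  | Comb of term * term
  | Abs of string * term

(* Alpha equivalence: env pairs up the bound variables of enclosing
   binders, innermost first. *)
let alpha_eq t1 t2 =
  let rec var_eq env x y =
    match env with
    | [] -> x = y
    | (bx, by) :: rest ->
        if x = bx || y = by then x = bx && y = by
        else var_eq rest x y
  in
  let rec go env t1 t2 =
    match t1, t2 with
    | Var x, Var y -> var_eq env x y
    | Const a, Const b -> a = b
    | Comb (f1, a1), Comb (f2, a2) -> go env f1 f2 && go env a1 a2
    | Abs (x, b1), Abs (y, b2) -> go ((x, y) :: env) b1 b2
    | _ -> false
  in
  go [] t1 t2

(* Does variable x occur free in a term? *)
let rec var_free_in x = function
  | Var y -> x = y
  | Const _ -> false
  | Comb (f, a) -> var_free_in x f || var_free_in x a
  | Abs (y, b) -> x <> y && var_free_in x b

(* Does sub occur free in t, comparing subterms modulo alpha equivalence? *)
let rec term_free_in sub t =
  alpha_eq sub t ||
  match t with
  | Comb (f, a) -> term_free_in sub f || term_free_in sub a
  | Abs (x, b) -> not (var_free_in x sub) && term_free_in sub b
  | _ -> false
```

Here term_free_in finds the identity function written with bound variable x inside a term that spells it with y, whereas a version comparing subterms with structural equality would not.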

Note that there are various subtle differences between different systems’ utilities that can trip up 
casually ported code. Examples include ProofPower HOL’s mk_const constructor, which does not test
that a constructed constant is well-formed, and hol90’s and HOL4’s dest_imp and is_imp, which work
for logical negation as well as implication (although HOL4 has dest_imp_only and is_imp_only for
implication only). The API chooses more conventional behaviour. 
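
The dest_imp discrepancy can be illustrated with a toy model. The term datatype, the string names of the constants and the exception Hol_fail are all assumptions of this sketch, and the lenient variant, which treats a negation of p as the implication from p to falsity, is a model of the legacy behaviour described above rather than a copy of any system's source.

```ocaml
(* Toy untyped terms, sufficient for implication and negation. *)
type term =
  | Var of string
  | Const of string
  | Comb of term * term

(* Hypothetical stand-in for a "normal failure" exception. *)
exception Hol_fail of string * string

let mk_imp (a, c) = Comb (Comb (Const "==>", a), c)
let mk_not t = Comb (Const "~", t)

(* Conventional destructor: succeeds only on implications. *)
let dest_imp = function
  | Comb (Comb (Const "==>", a), c) -> (a, c)
  | _ -> raise (Hol_fail ("dest_imp", "not an implication"))

(* Lenient destructor: also accepts a negation "~p", treating it as
   "p ==> F". *)
let dest_imp_lenient = function
  | Comb (Const "~", t) -> (t, Const "F")
  | t -> dest_imp t

(* Discriminators derived from the two destructors. *)
let is_imp t =
  try ignore (dest_imp t); true with Hol_fail _ -> false
let is_imp_lenient t =
  try ignore (dest_imp_lenient t); true with Hol_fail _ -> false
```

Code that uses is_imp to decide how to break up a goal will take different branches on negated terms under the two versions, which is how such a difference can silently change the behaviour of ported routines.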

3.3.3 Theory Extension and Listing Commands 

Around 40 theory extension and querying functions are provided. This includes primitive theory extension
commands for type declaration, term declaration, constant definition, constant specification and type
constant definition. On top of these, there are a few basic derived theory extension commands, for example
the command to define a function constant using a universal quantifier for the function arguments
instead of a lambda abstraction. Most systems have more sophisticated extension commands, but these 
are excluded from the platform because there is much variation in their capability between systems. 

Each system also provides querying commands to access information about the theory extensions 
that have been made, although HOL Light omits support for querying about primitive type constant
definitions. Such commands are essential for the approach for proof auditing advocated in [2], and a
complete set features in the API. 

3.3.4 Inference Rules

Around 100 basic inference rules are provided by the API (see Table 4 for a selection).

It is sufficient for the platform inference rules to include just a kernel of primitive rules* that suffice,
when coupled with the axiom and definition theorems in the theory platform, to implement the HOL
deductive system. Given our choice of theory platform, any of the systems’ primitive inference rules
would be sufficient. However, efficiency is also a consideration. If a primitive rule of a given system
were missing from the API, it would have to be reimplemented in that system’s outer platform in terms
of the API inference rules, which would in turn need to be implemented in terms of the system’s
primitives. An execution of such a recreated primitive could require 10 pre-platform rule applications or
more, resulting in an unacceptable performance penalty. Thus we choose to include the union of primitive
rules from each system in the platform (with the exception of one HOL Light primitive explained below).
This principle qualifies around 35 rules for inclusion in the platform. Note that each system except HOL
Zero has primitive rules that are derivable in terms of other primitives, but are included to improve the
system’s performance, which explains why the union includes as many as 35.

Also included are around 15 other inference rules at roughly the same level as the union of the
primitive inference rules, including the equality symmetry rule and the cut rule, for using the conclusion
of one theorem to eliminate an assumption in another. A further 25 rules are included for performing
equality congruence over certain operators, in addition to the two that are present as a result of being
primitive inference rules. For coherence, these fill out the patchy provision in existing HOL systems
with full coverage for the HOL operators supported by the API syntax functions.

In addition, for natural arithmetic expressions there are conversions provided for performing evaluation
of operators applied to numeral arguments for each of the 13 natural arithmetic operators featured
in the theory platform. This is sufficient to provide complete coverage of the primitive natural numeral
arithmetic inference rules provided by hol90 and ProofPower HOL (which represent numerals as
constants). This allows the platform to keep abstract the underlying representation of numerals.

It is vital that the API specifies precise behaviour for each of its inference rules. There is a degree
of variation in the behaviour of various rules between systems. We outline here some ways in which the
platform promotes robustness in the details of the behaviour it specifies for its inference rules.

As with the API’s term utilities, its inference rules also work modulo alpha equivalence, for the
same reasons. Note that HOL Light’s BETA rule (not to be confused with its
BETA_CONV rule) can fail depending on the name used for a bound variable in one of its arguments, and
because of this it is excluded from the API, despite being a primitive of HOL Light. Fortunately, the
consequences on performance in HOL Light are minimal because BETA can be implemented purely in
terms of BETA_CONV, which is in the API.

It was also decided that API inference rules should not depend on the presence of assumptions in their
theorem arguments, also to help robustness. It is harmless for a rule to remove an assumption if it can,
and this should not result in failure in rules composed with it. So, for example, the rule for discharging
an assumption matching a supplied term should not fail if the assumption is not present in the theorem
argument. Note that ProofPower’s classical contradiction rule c_contr_rule breaks this principle, but
other systems’ equivalents do not.
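
The assumption-independence principle can be sketched with a toy theorem type. Everything in this OCaml fragment is an assumption made for illustration: formulas have no binders (so plain equality stands in for alpha equivalence), the thm record is left concrete whereas an LCF-style kernel would make it abstract, and disch_rule_strict is a hypothetical rule shown only for contrast.

```ocaml
(* Toy propositional formulas; no binders, so structural equality
   stands in for alpha equivalence. *)
type form =
  | Atom of string
  | Imp of form * form

(* A toy theorem: a list of assumptions and a conclusion. A real
   LCF-style kernel would keep this type abstract. *)
type thm = { asms : form list; concl : form }

(* Hypothetical stand-in for a "normal failure" exception. *)
exception Hol_fail of string * string

(* From p derive {p} |- p. *)
let assume_rule p = { asms = [p]; concl = p }

(* From A |- q derive A \ {p} |- p ==> q. The assumption p is removed
   if present, and its absence is not an error. *)
let disch_rule p th =
  { asms = List.filter (fun a -> a <> p) th.asms;
    concl = Imp (p, th.concl) }

(* A hypothetical strict variant that insists on p being present,
   breaking the principle described above. *)
let disch_rule_strict p th =
  if List.mem p th.asms then disch_rule p th
  else raise (Hol_fail ("disch_rule_strict", "assumption not present"))
```

With the tolerant rule, a derived rule that discharges an assumption keeps working even if an earlier step has already eliminated that assumption, whereas the strict variant makes success depend on incidental details of earlier steps.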

There are also various other differences in behaviour between seemingly equivalent rules in different
HOL systems. One particularly extreme case is the rule for instantiating type variables, called INST_TYPE in
hol90, HOL4 and HOL Light, which is a primitive of every HOL system. In hol90, only type variables
in the conclusion are instantiated. In HOL Light and HOL4, non-variable types in the instantiation list
argument do not cause failure. And in ProofPower HOL, any free variables that would otherwise become
equal as a result of the instantiation are renamed. None of these idiosyncrasies exist in the API version.

* In the paper, we occasionally abbreviate the term inference rule to rule.

hol90                 HOL4                  ProofPower            HOL Light             HOL Zero
--------------------  --------------------  --------------------  --------------------  --------------------
ASSUME*               ASSUME*               asm_rule*             ASSUME*               assume_rule*
BETA_CONV*            BETA_CONV*            simple_β_conv*        BETA_CONV             beta_conv*
CCONTR*               CCONTR*               {c_contr_rule}        CCONTR                ccontr_rule
CHOOSE*               CHOOSE*               simple_∃_elim         CHOOSE                choose_rule
CONJ*                 CONJ*                 ∧_intro               CONJ                  conj_rule
CONJUNCT1*            CONJUNCT1*            ∧_left_elim           CONJUNCT1             conjunct1_rule
CONJUNCT2*            CONJUNCT2*            ∧_right_elim          CONJUNCT2             conjunct2_rule
CONTR*                CONTR                 contr_rule            CONTR                 contr_rule
-                     -                     -                     DEDUCT_ANTISYM_RULE*  deduct_antisym_rule
DISCH*                DISCH*                ⇒_intro*              DISCH                 disch_rule*
DISJ1*                DISJ1*                ∨_right_intro         DISJ1                 disj1_rule
DISJ2*                DISJ2*                ∨_left_intro          DISJ2                 disj2_rule
DISJ_CASES*           DISJ_CASES*           ∨_elim                DISJ_CASES            disj_cases_rule

Table 4: Some of the inference rule API components. Primitive rules in a given system are marked with *.

3.3.5 Parsing and Pretty Printing 

Around 20 functions supporting parsing and pretty printing are provided in the API. This includes
functions for parsing strings into HOL types and terms, and printers for types, terms and theorems. There
is also support for setting the fixity of HOL functions and type operators. The fixities supported exceed
what is provided by hol90, ProofPower HOL and HOL Light, but do not extend to the full range of
fixities supported by HOL4. There are plans to extend the platform to support all of HOL4’s fixities.


4 Implementation 

4.1 Architecture 

For a legacy system to conform to an API, its source code must be adapted so that every component of 
the API is implemented in the system. For the Common HOL API, we use a software architecture for 
adapting legacy HOL systems that is designed with the three goals of minimising implementation effort, 
enabling API-level virtualisation, and facilitating the demonstration that the adapted system exhibits 
precisely the same behaviour as the legacy system. 

To achieve this, we choose an appropriate point in the build of the legacy system that corresponds to 
the level of the API (the platform level), and insert an ML module for the API components (the platform 
module) at this point. All legacy source code occurs either before or after the platform level (respectively 
called the pre-platform and post-platform code) and stays exactly the same. Keeping the pre- and 
post-platform code the same makes it easier to argue that the system’s behaviour has not been altered. 

In the platform module, we define the API in terms of pre-platform functionality. Any API components 
not precisely implemented as a pre-platform component must be implemented here. This includes 
components that are missing from the legacy system, that have only imprecisely corresponding 
equivalents in the pre-platform code, or that are implemented as post-platform code. For any implemented 
as post-platform code, the full tree of post-platform code used to define it can be shifted into the platform 
module, or, if this is too big, a more succinct version can be implemented specially for the platform. The 
code for post-platform API components can then be deleted from its original position in the source code 
(thus the post-platform code remains the same except for deleted code that occurs in the platform module). 

In our architecture, all post-platform code implementing primary functionality is implemented in 
terms of the API. This enables the API to act as a virtualisation layer through which all primary 
functionality is executed. This virtualisation layer can then be used for recording proofs as they are executed, 
before exporting them to proof objects. In order to achieve this and keep the post-platform code the 
same, we must somehow have a way of referring to pre-platform code that is used by post-platform code 
but is not in the API. We do this by implementing a module immediately after the platform module in the 
build that re-implements all such pre-platform code in terms of the platform, overwriting the pre-platform 
code. We call this the outer platform module. 
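
The layering described above can be illustrated with a toy sketch. Everything in it is hypothetical: the module names Pre, Platform and Outer, the string-based thm type, and the rules refl, sym and double_sym are stand-ins chosen for illustration, not code from any adapted system. The sketch only shows how unchanged legacy code can end up running through a recording layer.

```ocaml
(* Toy model of the build order. All names are hypothetical. *)

(* Pre-platform code: the legacy kernel, left exactly as it is. *)
module Pre = struct
  type thm = Thm of string
  let concl (Thm s) = s
  let refl t = Thm (t ^ " = " ^ t)
  let sym (Thm s) = Thm ("sym(" ^ s ^ ")")
  (* Used by post-platform code, but not an API component. *)
  let double_sym th = sym (sym th)
end

(* Platform module: API components defined in terms of pre-platform code.
   Every rule passes through [record], so the API can log proof steps. *)
module Platform = struct
  let log : string list ref = ref []
  let record name th = log := name :: !log; th
  let refl t = record "refl" (Pre.refl t)
  let sym th = record "sym" (Pre.sym th)
end

(* Outer platform module: re-implements non-API pre-platform code in
   terms of the platform, overwriting the pre-platform definition. *)
module Outer = struct
  let double_sym th = Platform.sym (Platform.sym th)
end

(* Later opens shadow earlier ones, mimicking definitions being
   overwritten as the build proceeds. *)
open Pre
open Platform
open Outer

(* Post-platform code: textually unchanged legacy code. *)
let legacy_derived t = double_sym (refl t)

let result = concl (legacy_derived "x")
let steps = List.rev !log
```

Here result is "sym(sym(x = x))" and steps is ["refl"; "sym"; "sym"]: the legacy function double_sym is never edited at its call site, yet each primitive step it performs is logged by the platform.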

In arguing that the system’s behaviour has not altered in the API-adjusted version of the system, we 
must justify why any reimplementation of post-platform code in the platform module, and any 
reimplementation of pre-platform code in the outer platform module, preserves functionality. 

Given that the API components correspond to classic basic components of a HOL system that tend 
to be implemented towards the start of the build of the system, finding an appropriate insertion point for 
the platform level tends to be fairly straightforward. It is to be found after the definition of the HOL 
type and term datatypes and basic utilities for manipulating them, the inference kernel, the initial theory 
and the parser and pretty printer. It is typically before the derived inference rules for predicate logic and 
the theory for pairs and natural numbers, which would need to be moved to or recreated in the platform 
module. 

4.2 Adapting HOL Light 

We now describe how we adapted HOL Light SVN release 197 to conform to the platform. The reader 
may find it instructive to download the adapted system [?]. 

The platform level in the HOL Light build file was chosen between the source files parser.ml and 
equal.ml. About 1,000 lines of post-platform code implementing platform components were moved 
into the platform module. Much of this was derived inference rules implemented using lemmas proved 
using HOL Light’s automated proof facilities. Instead of recreating these facilities inside the platform 
module, we employed Common HOL proof porting to export the proofs of these lemmas as proof objects, 
which were then hand-translated into a total of around 400 lines of forwards style proof script in the 
platform module. An alternative approach was used to recreate the 13 evaluation rules for natural numeral 
arithmetic, whose implementation in calc_num.ml involves lemmas proved in hundreds of lines of proof 
script. Instead of exporting proof objects for these lemmas, the inference rules were given a completely 
different implementation in the platform module, ported from HOL Zero in about 800 lines. 

About 1,000 lines of code were required to fill out platform components missing from HOL Light. 
For those components with an approximate equivalent already in HOL Light, the existing component 
was used in the implementation of the platform variant (e.g. see Figure 1), to ensure that the platform 
variant had roughly the same performance as the original. Those components with no approximate HOL 
Light version were ported from HOL Zero. In total, the components ported from HOL Zero required 
about 1,350 lines of supporting source code to be ported from HOL Zero, mainly involving forwards 
proof to prove lemmas. The platform module interface is written in about 500 lines of code. 

For the outer platform, primitive inference rules and theory commands that do not correspond to 
platform components must be precisely recreated in terms of the platform. In HOL Light, this involves 
the INST_TYPE and BETA rules and all the theory commands. Also, non-platform theorems used to define 
platform theory needed to be recreated. In total, the outer platform required around 800 lines of code. 

let INST_TYPE1 theta th =
  (* Fail unless every element of the instantiation domain is a type variable *)
  let () = if not (forall (is_vartype o snd) theta)
           then failwith "INST_TYPE: Non-type-variable in instantiation domain" in
  INST_TYPE theta th;;


Figure 1: Using HOL Light’s original INST_TYPE in the definition of the platform variant. 


Overall, the platform and outer platform modules involved around 6,000 lines of source code, including 
the platform module interface. This took around two weeks of effort to create. The code was mostly 
systematically produced, being either moved from other parts of HOL Light, ported from HOL Zero, 
translated from proof object files, or simply a listing of platform components. The only code requiring 
creative thought was in the platform module variants of components with approximate equivalents 
already in HOL Light, and in much of the outer platform, totalling around 1,000 lines. 

5 Use Cases 

In this section, we report on two use cases for the Common HOL Platform in assisting manual ports of 
source code between platform-adapted HOL systems. In both cases, the port was from HOL Light to 
HOL Zero. This is on the easy end of the difficulty spectrum in inter-HOL-system code porting, because 
both systems are implemented in the same dialect of ML, i.e. OCaml, and because the target system, 
HOL Zero, is almost a blank canvas with very little post-platform code to consider. Other HOL systems 
have considerable post-platform code, and porting should attempt to reuse any pre-existing code if it is 
straightforward to do so, to avoid creating an almost duplicate stack of supporting functionality in the 
target system. However, both ports described here would still be difficult without the support of the 
platform, and so the use cases provide useful insight. 

5.1 Legacy Code Port: HOL Light Rewriting Mechanism to HOL Zero 

In our first use case, we ported HOL Light’s entire rewriting apparatus to HOL Zero. This is defined 
relatively early on in HOL Light’s post-platform code, but provides vital functionality that is used throughout 
the rest of the system, and goes far beyond what HOL Zero is capable of in terms of proof automation. It 
is implemented in 360 lines of code, in the HOL Light source file simp.ml, and relies on 60 lines of code 
defining discrimination nets, and a further 300 lines of post-platform code defining supporting 
functionality such as conversion combinators. Thus there was a total of 720 lines to port, but this would probably 
be less if porting to another HOL system because it would already support conversion combinators. See 
Figures 2 and 3 for a sample of 32 lines from the port. 

The manual port was carried out in about 2 hours 30 minutes of effort. Note that this time does not 
include approximately 30 minutes of effort required to extract the 360 lines of HOL Light supporting 
code prior to the port. The porting itself involved systematically looking up HOL Zero equivalents of 
HOL Light platform functions, and renaming accordingly. HOL Light’s uppercase names, which do not 
conform to normal OCaml lexical syntax, also needed to be converted to lowercase names. Instantiation 
lists, which have old-to-new ordering in HOL Zero but new-for-old ordering in HOL Light, needed to be 
switched around. The datatype constructors for types and terms, which are visible outside their defining 
module in HOL Light but not in HOL Zero, required some pattern matches to be replaced with abstract 
destructors and if-expressions. The function term_match name-clashed with a pre-existing HOL Zero 
function, and so was renamed to hl_term_match. 
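
Two of these adjustments can be sketched in miniature. The Term module below is a hypothetical stand-in with an abstract term type, not HOL Zero's actual interface, and head and swap_inst are illustrative names. It shows a constructor pattern match rewritten with destructors and an if-expression, and an instantiation list switched from (new, old) to (old, new) ordering.

```ocaml
(* Hypothetical miniature term type. The signature hides the constructors,
   so client code must use the abstract destructors. *)
module Term : sig
  type term
  val mk_var : string -> term
  val mk_comb : term * term -> term
  val is_comb : term -> bool
  val dest_comb : term -> term * term
  val dest_var : term -> string
end = struct
  type term = Var of string | Comb of term * term
  let mk_var x = Var x
  let mk_comb (f, a) = Comb (f, a)
  let is_comb = function Comb _ -> true | Var _ -> false
  let dest_comb = function
    | Comb (f, a) -> (f, a)
    | Var _ -> failwith "dest_comb"
  let dest_var = function
    | Var x -> x
    | Comb _ -> failwith "dest_var"
end

open Term

(* With visible constructors one could write
     match tm with Comb (f, _) -> head f | _ -> tm
   With an abstract type the same logic becomes an if-expression. *)
let rec head tm =
  if is_comb tm then head (fst (dest_comb tm)) else tm

(* Switch an instantiation list from (new, old) pairs to (old, new) pairs. *)
let swap_inst theta = List.map (fun (x, y) -> (y, x)) theta
```

For example, head applied to the term built by mk_comb (mk_comb (mk_var "f", mk_var "x"), mk_var "y") returns the variable f, and swap_inst [("t1", "x")] gives [("x", "t1")].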

let mk_rewrites =
  let IMP_CONJ_CONV = REWR_CONV(ITAUT `p ==> q ==> r <=> p /\ q ==> r`)
  and IMP_EXISTS_RULE =
    let cnv = REWR_CONV(ITAUT `(!x. P x ==> Q) <=> (?x. P x) ==> Q`) in
    fun v th -> CONV_RULE cnv (GEN v th) in
  let collect_condition oldhyps th =
    let conds = subtract (hyp th) oldhyps in
    if conds = [] then th else
    let jth = itlist DISCH conds th in
    let kth = CONV_RULE (REPEATC IMP_CONJ_CONV) jth in
    let cond,eqn = dest_imp(concl kth) in
    let fvs = subtract (subtract (frees cond) (frees eqn)) (freesl oldhyps) in
    itlist IMP_EXISTS_RULE fvs kth in
  let rec split_rewrites oldhyps cf th sofar =
    let tm = concl th in
    if is_forall tm then
      split_rewrites oldhyps cf (SPEC_ALL th) sofar
    else if is_conj tm then
      split_rewrites oldhyps cf (CONJUNCT1 th)
        (split_rewrites oldhyps cf (CONJUNCT2 th) sofar)
    else if is_imp tm & cf then
      split_rewrites oldhyps cf (UNDISCH th) sofar
    else if is_eq tm then
      (if cf then collect_condition oldhyps th else th)::sofar
    else if is_neg tm then
      let ths = split_rewrites oldhyps cf (EQF_INTRO th) sofar in
      if is_eq (rand tm)
      then split_rewrites oldhyps cf (EQF_INTRO (GSYM th)) ths
      else ths
    else
      split_rewrites oldhyps cf (EQT_INTRO th) sofar in
  fun cf th sofar -> split_rewrites (hyp th) cf th sofar;;


Figure 2: A sample of legacy source code from HOL Light’s simp.ml. 


HOL Light non-conformant versions of platform functions, such as its variant function, required 
special attention. Unlike the platform equivalent, this function does not fail if its avoidance list contains 
non-variables, and so the code was adapted to either filter them out or check that non-variables are not 
possible from program context. Other complications included two uses of HOL Light’s intuitionistic 
tautology prover, ITAUT. It was decided to keep this function outside the scope of the port, despite it 
being used to prove two lemmas, to reduce the amount of supporting code. For the HOL Zero version, 
one of the lemmas already existed in HOL Zero’s small library of predicate logic theorems, and the other 
was proved in 10 minutes in a 16-line proof using HOL Zero’s forward inference rules. 
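
The adaptation needed for a lenient legacy function can be sketched as follows. This is a toy model under stated assumptions: the term type, the priming strategy and the error messages are hypothetical, and variant_lenient is an illustrative name. Only the behaviour described above is modelled, namely a strict variant function that fails on non-variables in its avoidance list, and a ported call site that filters them out first.

```ocaml
(* Hypothetical miniature term type. *)
type term = Var of string | Const of string

let is_var = function Var _ -> true | Const _ -> false

(* Strict, platform-style variant: fails if the avoidance list contains
   a non-variable, otherwise primes the name until it is fresh. *)
let rec variant avoid v =
  if not (List.for_all is_var avoid) then
    failwith "variant: non-variable in avoidance list"
  else
    match v with
    | Var x -> if List.mem v avoid then variant avoid (Var (x ^ "'")) else v
    | Const _ -> failwith "variant: not a variable"

(* Ported call site whose avoidance list may hold non-variables:
   filter them out before calling the strict function. *)
let variant_lenient avoid v = variant (List.filter is_var avoid) v
```

With these definitions, variant_lenient [Var "x"; Const "c"; Var "x'"] (Var "x") returns Var "x''", whereas calling variant directly on the same list raises a failure. Where program context guarantees that the list holds only variables, the filter can be omitted.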


After the port was completed, it was tested on various rewriting examples, and one error was found. 
This took 45 minutes of debugging to track down and correct, and was due to a quirk in the failure 
exception returned by HOL Light’s rev_assoc function, which has error message text "find" (instead 
of "rev_assoc"). This particular error message was explicitly trapped in the HOL Light code, but 
naively porting this to HOL Zero didn’t work because its equivalent function, inv_assoc, uses error 
message text "inv_assoc". As explained in Section 3.3.1, this aspect of porting is not catered for by the 
platform, and must be done manually. 
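
The nature of this error can be reproduced in a small sketch. The two lookup functions below are simplified stand-ins that use OCaml's built-in Failure exception, which is an assumption made for illustration. Only the error message texts "find" and "inv_assoc" come from the description above, and the names lookup_legacy, lookup_naive and lookup_ported are hypothetical.

```ocaml
(* Stand-in for HOL Light's rev_assoc: fails with message "find". *)
let rec rev_assoc_hl y = function
  | [] -> failwith "find"
  | (a, b) :: rest -> if b = y then a else rev_assoc_hl y rest

(* Stand-in for HOL Zero's inv_assoc: fails with message "inv_assoc". *)
let rec inv_assoc_hz y = function
  | [] -> failwith "inv_assoc"
  | (a, b) :: rest -> if b = y then a else inv_assoc_hz y rest

(* Legacy code traps the specific message text. *)
let lookup_legacy y l =
  try rev_assoc_hl y l with Failure msg when msg = "find" -> "default"

(* Naive port: only the function name is changed, so the handler no
   longer matches and the failure escapes. *)
let lookup_naive y l =
  try inv_assoc_hz y l with Failure msg when msg = "find" -> "default"

(* Corrected port: the trapped message text is updated as well. *)
let lookup_ported y l =
  try inv_assoc_hz y l with Failure msg when msg = "inv_assoc" -> "default"
```

On a list with no matching entry, lookup_legacy and lookup_ported both return "default", while lookup_naive raises Failure "inv_assoc". A rename table alone cannot catch this kind of dependency on message text.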

let mk_rewrites =
  let imp_conj_conv = rewr_conv imp_imp_thm
  and imp_exists_rule =
    let cnv = rewr_conv imp_exists_rule_thm in
    fun v th -> conv_rule cnv (gen_rule v th) in
  let collect_condition oldhyps th =
    let conds = subtract (asms th) oldhyps in
    if conds = [] then th else
    let jth = foldr disch_rule conds th in
    let kth = conv_rule (repeatc imp_conj_conv) jth in
    let cond,eqn = dest_imp(concl kth) in
    let fvs = subtract (subtract (free_vars cond) (free_vars eqn))
                       (list_free_vars oldhyps) in
    foldr imp_exists_rule fvs kth in
  let rec split_rewrites oldhyps cf th sofar =
    let tm = concl th in
    if is_forall tm then
      split_rewrites oldhyps cf (spec_all_rule th) sofar
    else if is_conj tm then
      split_rewrites oldhyps cf (conjunct1_rule th)
        (split_rewrites oldhyps cf (conjunct2_rule th) sofar)
    else if is_imp tm & cf then
      split_rewrites oldhyps cf (undisch_rule th) sofar
    else if is_eq tm then
      (if cf then collect_condition oldhyps th else th)::sofar
    else if is_not tm then
      let ths = split_rewrites oldhyps cf (eqf_intro_rule th) sofar in
      if is_eq (rand tm)
      then split_rewrites oldhyps cf (eqf_intro_rule (gsym_rule th)) ths
      else ths
    else
      split_rewrites oldhyps cf (eqt_intro_rule th) sofar in
  fun cf th sofar -> split_rewrites (asms th) cf th sofar;;


Figure 3: The translation into HOL Zero of the legacy code sample from simp.ml. 


5.2 New Code Port: HOL Light Proof Importer to HOL Zero 

In the second use case, we used the platform to port HOL Light’s importer for Common HOL proof 
objects. This was a fundamentally easier exercise because the proof importer is written specifically in 
terms of the API, and because Common HOL proof porting works at the level of platform inference rules 
itself. The proof importer is implemented in 2,200 lines of code. 

It took about 1 hour 15 minutes to perform the porting. Despite the source code being three times 
longer than in the legacy code port, it took only half the time. The easier nature of the task meant that 
everything went smoothly first time. The effort consisted almost entirely of systematically applying 
search-and-replace to replace HOL Light platform function names with HOL Zero equivalents and 
carrying out manual adjustments for functions that take their arguments differently in the different systems. 
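
The renaming step can be sketched as a whole-identifier substitution. This is a minimal illustration rather than the tooling actually used: the function rename_idents and the table renames are hypothetical, and the table holds only a handful of name pairs read off from the correspondence between Figures 2 and 3.

```ocaml
(* A few HOL Light to HOL Zero name pairs, read off Figures 2 and 3. *)
let renames = [
  "CONV_RULE", "conv_rule"; "DISCH", "disch_rule"; "hyp", "asms";
  "frees", "free_vars"; "itlist", "foldr"; "is_neg", "is_not";
]

let is_ident_char c =
  match c with
  | 'a'..'z' | 'A'..'Z' | '0'..'9' | '_' -> true
  | _ -> false

(* Rename whole identifiers only, so that "hyp" is replaced but the
   identifier "oldhyps" is left alone. *)
let rename_idents table src =
  let n = String.length src in
  let buf = Buffer.create n in
  let rec go i =
    if i < n then
      if is_ident_char src.[i] then begin
        (* Scan to the end of the identifier starting at position i. *)
        let j = ref i in
        while !j < n && is_ident_char src.[!j] do incr j done;
        let id = String.sub src i (!j - i) in
        Buffer.add_string buf (try List.assoc id table with Not_found -> id);
        go !j
      end else begin
        Buffer.add_char buf src.[i];
        go (i + 1)
      end
  in
  go 0;
  Buffer.contents buf
```

Applied to the line "let jth = itlist DISCH conds th in" from Figure 2, rename_idents renames yields the corresponding line of Figure 3. The sketch does not distinguish code from string literals or term quotations, and it does nothing about argument order, which is why manual adjustments remain necessary.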

The resulting source code was tested by importing into HOL Zero the text formalisation part of the 
Flyspeck project, as part of a partial audit of the project as described in [2]. This involved tens of 
millions of platform-level inference rule steps. The import into HOL Zero worked first time, suggesting 
the code was ported correctly. 

6 Conclusions 

In defining a standard for basic theory and programming components, the Common HOL Platform is 
attempting to lay the foundation for much better portability between HOL systems, both in terms of 
porting proofs and porting source code. The feasibility of large scale proof porting has already been 
established by others, but arguably there is scope for doing better still, given a better foundation. However, 
the feasibility of quick and reliable source code porting has not been explored until now. 

In this paper, we have given an overview of the platform’s components and explained the reasons 
behind some of the careful design decisions made. We have also demonstrated using the platform in 
two use cases of manually porting source code from HOL Light to HOL Zero, one for legacy code and 
one for new code written specially for the platform. In both cases, several hundred lines of code were 
successfully and reliably ported within a few hours. Much of the effort normally involved in a manual 
port is removed, because almost all that needs to be considered is functionality implemented above the 
platform level. Finding corresponding low-level components in the two systems, and the subtle ways 
in which they can differ, has already been taken care of by the platform. As far as we are aware, this 
represents a leap in the productivity of source code porting between HOL systems, even when accounting 
for it being less challenging than the general porting case due to both systems being implemented in the 
same dialect of ML and due to HOL Zero effectively being a blank canvas. 

It would be interesting to see how far HOL source code porting could be pushed. Certainly it is 
feasible to port more challenging parts of HOL Light to HOL Zero. Obvious candidates are the subgoal 
package, the intuitionistic tautology checker and the powerful MESON_TAC. Implementing the latest 
version of the platform for hol90, HOL4 and ProofPower HOL, and porting to these systems is another 
challenge worth pursuing. The platform has already been designed with these systems in mind, and it 
would at least enable Common HOL proof exporters and importers to be quickly ported to these systems. 

One insight that comes from looking at code from the various HOL systems is how much the subgoal 
package is used in the implementation of other parts of HOL systems, suggesting that it should be part 
of the API. This should be a fairly easy extension to make, since beyond the implementation of an 
initial few tactics, code using it appears to operate at the abstract level using tacticals, rather than use 
the inner workings that differ between HOL systems. Another change worth making is to update the 
platform for the reform to primitive theory extension currently underway in various HOL systems 14]. 
And finally, catering for Isabelle/HOL must be a long term priority. This would probably require a 
significant overhaul of the platform to fit with such a different system, but if done well it would pay 
dividends to have good portability between the widest used HOL system and the rest of the family. 

The systematic manner in which the porting can be carried out lends itself to automation, or at least 
to partial automation. The most difficult to automate is probably the intelligent use of the target system’s 
legacy supporting code to avoid the ugly situation of creating two parallel stacks of code implementing 
effectively the same thing. Thus partial automation looks a more realistic prospect. We believe there are 
no fundamental difficulties in automatically porting between ML dialects, because the subsets of ML that 
tend to be used in the implementation of HOL systems are trivially corresponding between OCaml and 
SML. So we see there being good prospects for reducing further the time taken to reliably port source 
code, even in more challenging cases. 